Abstract

This paper provides a basic review of the concepts of confidence 
intervals, effect sizes, central and noncentral distributions. 
Specifically, the use of confidence intervals around effect 
sizes is discussed. A demonstration of the Exploratory Software 
for Confidence Intervals (ESCI) is given to illustrate effect 
size confidence intervals for the single and two-sample case. 
Calculating Confidence Intervals for Effect Sizes
Using Noncentral Distributions
There is much debate regarding the utility of statistical 
significance testing as a means of testing research effects (cf. 
Harlow, Mulaik, & Steiger, 1997; Henson & Smith, 2000;
Vacha-Haase, Nilsson, Reetz, Lance, & Thompson, 2000).

Recommendations have been made to use other statistical methods 
to evaluate data, such as the reporting and interpretation of 
effect sizes. Such practice is critical to good statistical 
methodology. As the American Psychological Association (APA) 
Task Force on Statistical Inference noted (Wilkinson & APA Task 
Force on Statistical Inference, 1999),

It is hard to imagine a situation in which a dichotomous 
accept-reject decision is better than reporting an actual
p value or, better still, a confidence interval. Never use
the unfortunate expression "accept the null hypothesis."
Always provide some effect-size estimate when reporting a p
value (p. 599).

The Task Force went on to state, "Always present effect sizes 
for primary outcomes" (p. 599, emphasis added).

Influenced by the Task Force report, the fifth edition of 
the APA Publication Manual (APA, 2001) called the "failure to 
report effect sizes" a "defect in the design and reporting of 
research" (p. 5). Regarding effect size and strength of
relationship, the Manual stated:

For the reader to fully understand the importance of your 
findings, it is almost always necessary to include some 
index of effect size or strength of relationship in your 
Results section. You can estimate the magnitude of the 
effect or the strength of the relationship with a number of 
common effect estimates... The general principle to be
followed...is to provide the reader not only with information
about statistical significance but also with enough
information to assess the magnitude of the observed effect
or relationship. (pp. 25-26)

The fifth edition Manual (APA, 2001) also commented on
the role of confidence intervals (CIs) in result interpretation:
The reporting of confidence intervals (for estimates of 
parameters, for functions of parameters such as differences 
in means, and for effect sizes) can be an extremely 
effective way of reporting results. Because confidence 
intervals combine information on location and precision and 
can often be directly used to infer significance levels, 
they are, in general, the best reporting strategy. The use 
of confidence intervals is therefore strongly recommended.
(p. 22, emphasis added)
The recommendation of the APA Task Force for the
interpretation of both (a) effect sizes and (b) confidence 
intervals leads quite naturally to a conclusion that (c) 
confidence intervals about effect sizes would be quite useful 
(Thompson, 2001). The APA Manual (APA, 2001) also alludes to
the possibility of CIs around effect sizes. 

A special section of Educational and Psychological 
Measurement (Vol. 61, No. 4) is devoted to CIs around effect 
sizes. In this issue, Cumming and Finch (2001) presented new 
software that illustrates the value of CIs around effects for 
single and multiple studies. The software runs under Microsoft 
Excel, is user friendly, is quite reasonably priced, and is 
called Exploratory Software for Confidence Intervals (ESCI, 
pronounced "esky").

This paper will show how CIs can be used around effect 
sizes, focusing on Cohen's d. However, unlike typical CIs around 
many other statistics such as the mean, CIs for standardized 
effects require the use of noncentral t distributions. 

Effect sizes 

Effect sizes are indices of the magnitude of an obtained
result or relationship (Fraenkel & Wallen, 1996). Effect size
indices fall into two broad categories: (a) those that directly
examine differences between means (e.g., Cohen's d), and
(b) those that express how much of the variability in the
dependent variable is accounted for by variation in the
independent variable(s). Examples of variance-accounted-for
effects include eta squared, omega squared, multiple R squared,
and adjusted multiple R squared.

Cohen's d

Cohen's d is a mean or mean difference, standardized via
division by the pooled standard deviation (Cumming & Finch,
2001):

$$ d = \frac{\mu_1 - \mu_2}{\sigma} $$

The interpretation of mean differences using Cohen's d is in
terms of standard deviation units. According to Cohen (1988), d 
values of 0.2, 0.5, and 0.8 can be roughly regarded as small, 
medium, and large effects, although researchers too often 
rigidly employ these rules of thumb. 
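
As a minimal sketch of the definition above, the following computes a sample Cohen's d for two independent groups using the pooled standard deviation. The function name `cohens_d` and the scores are hypothetical, invented purely for illustration, and the pooled variance uses the usual n - 1 weighting, which is an assumption about how the pooled standard deviation is estimated.

```python
import math
from statistics import mean, variance


def cohens_d(group1, group2):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    # statistics.variance() is the sample variance (n - 1 denominator).
    pooled_var = (
        (n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)
    ) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / math.sqrt(pooled_var)


# Hypothetical scores, not taken from the paper.
treatment = [12, 15, 14, 10, 13, 16]
control = [9, 11, 10, 8, 12, 10]

d = cohens_d(treatment, control)
print(f"Cohen's d = {d:.3f} standard deviation units")
```

The result is read in standard deviation units, so Cohen's (1988) benchmarks can be applied to it, with the caution against rigid use noted above.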

For a single sample case the following formula is
applicable to generate d from a t-value:

$$ d = \frac{t}{\sqrt{n}} $$

The utility of Cohen's d is expanded in relation to
Pearson's r (point-biserial) by the following formula (given the
two populations are of equal size):

$$ r = \frac{d}{\sqrt{d^2 + 4}} $$
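
The two conversions above are simple enough to check by hand, but a short sketch may make them concrete. The function names `d_from_t` and `r_from_d` are hypothetical. The first call uses the t value and sample size reported for the single-group example in this paper (t = 27.46, n = 30); the d value passed to `r_from_d` is an arbitrary illustration.

```python
import math


def d_from_t(t, n):
    """Single-sample case: d = t / sqrt(n)."""
    return t / math.sqrt(n)


def r_from_d(d):
    """Point-biserial r from d, assuming two groups of equal size."""
    return d / math.sqrt(d ** 2 + 4)


d_single = d_from_t(27.46, 30)   # values from the paper's single-group example
r_medium = r_from_d(0.5)         # arbitrary "medium" d, for illustration

print(f"d from t: {d_single:.3f}")
print(f"r from d = 0.5: {r_medium:.3f}")
```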

Because d is a ratio of two quantities (mean difference and
standard deviation), noncentral t distributions are appropriate
to compute accurate confidence intervals about standardized
effect sizes (Cumming & Finch, 2001).

Central and Noncentral t Distributions 

Prior to discussing CIs about standardized effects it is 
necessary to address central and noncentral t distributions. 

The familiar t distribution is really a special case of a 
broader class of distributions called "noncentral" distributions 
(Thompson, 2001). Typical inferential techniques are based on 
central distributions like the t distribution (Cumming & Finch, 
2001). Central t distributions are a family of distributions 
based on degrees of freedom. The central t distribution is 
symmetrical, centered at zero, and approaches the standard 
normal distribution as sample size increases (Thompson, 2001).
Central t distributions arise when a normally distributed
variable with a mean of zero is divided by an independent
variable closely related to the χ² distribution (Cumming & Finch,
2001). In contrast, noncentral t distributions arise when a
normally distributed variable with a mean not equal to zero is
divided by an independent variable closely related to the χ²
distribution.

Noncentral t distributions are not necessarily symmetrical,
and the degree of skewness depends on the noncentrality
parameter (Δ). The noncentrality parameter is the distance by
which the mean of the normal distribution is displaced from zero
and can be estimated by multiplying d by the square root of n
(Cumming & Finch, 2001). The noncentral t distribution is
centered at approximately Δ, especially if Δ is small and the
degrees of freedom are large (Cumming & Finch, 2001). When Δ is
equal to zero, the noncentral t distribution is exactly the
same as the familiar central t distribution (Thompson, 2001).
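
The construction described above can be illustrated by simulation. The sketch below is only an illustration under stated assumptions: it takes "an independent variable closely related to the chi-square distribution" to mean the square root of a chi-square variate divided by its degrees of freedom, and the function name `simulate_t`, the seed, the degrees of freedom, and the noncentrality value of 2 are all arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)


def simulate_t(delta, df, size=200_000):
    """Draw from a (non)central t: a normal variate with mean delta divided
    by sqrt(chi-square / df), with numerator and denominator independent."""
    z = rng.normal(loc=delta, scale=1.0, size=size)
    chi2 = rng.chisquare(df, size=size)
    return z / np.sqrt(chi2 / df)


central = simulate_t(delta=0.0, df=29)      # noncentrality 0: central t
noncentral = simulate_t(delta=2.0, df=29)   # displaced normal: noncentral t

print(f"mean with noncentrality 0: {central.mean():.3f}")
print(f"mean with noncentrality 2: {noncentral.mean():.3f}")
```

With the noncentrality parameter at zero the simulated values center on zero, and with a nonzero value they center close to, but not exactly at, that value.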

Properties of the noncentral t distributions are
illustrated using the ESCI software (workbook Noncentralt).
Figure 1 presents a comparison of central t distributions
(pictured on the left) with noncentral t distributions (pictured
on the right).



INSERT FIGURE 1 ABOUT HERE 



The ESCI program allows for manipulation of degrees of freedom
and the noncentrality parameter (Δ) to show changes in the shape
of the distribution. The following summary, based heavily on
Cumming and Finch (2001), identifies some of the properties of
noncentral t distributions that are illustrated with the ESCI
program:

• When the noncentrality parameter is equal to zero, 
noncentral t distributions are exactly the same as central 
t distributions, which means symmetric with a mean of zero. 

• The noncentral t distribution is centered at approximately 
the value of the noncentrality parameter except for small 
degrees of freedom. 

• The shape of noncentral t distributions will always be 
skewed unless Δ = 0.

• For a particular noncentrality parameter, as the degrees of 
freedom increase the shape of the curve is less skewed and 
more closely resembles central t distributions in shape. 

• For a particular noncentrality parameter, as the degrees of
freedom decrease the shape of the curve becomes flatter and 
more skewed. 

• As the absolute value of the noncentrality parameter 
increases so does the extent of skewness associated with 
the distribution. 

• The noncentrality parameter can be positive or negative,
with the outward tail becoming larger as the absolute value
of the noncentrality parameter increases.
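
Several of the properties listed above can be checked numerically without ESCI. The following sketch assumes SciPy is available and relies on `scipy.stats.nct.stats` with `moments="s"` to return the skewness of the noncentral t distribution (which exists only for more than three degrees of freedom). The helper name `nct_skew` and the grid of values are hypothetical choices for illustration.

```python
from scipy import stats


def nct_skew(df, delta):
    """Skewness of the noncentral t distribution (requires df > 3)."""
    return float(stats.nct.stats(df, delta, moments="s"))


# Tabulate skewness over a few degrees of freedom and noncentrality values.
for df in (10, 30, 100):
    for delta in (-3, 0, 1, 3):
        print(f"df={df:>3}  noncentrality={delta:>2}  "
              f"skewness={nct_skew(df, delta):+.3f}")
```

Reading down the table for a fixed noncentrality value shows the skewness shrinking as the degrees of freedom grow, and reading across for fixed degrees of freedom shows it growing with the absolute value of the noncentrality parameter and changing sign with it.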
Confidence Intervals

Fidler and Thompson (2001) stated that more accurate
definitions frame a CI as one interval from an infinite or at
least large sample of CIs for a given parameter in which
100(1 - α)% (often 95%) of the intervals would capture the
population parameter. CIs give a best point estimate of the
population parameter of interest and an interval about that
estimate to reflect likely error, that is, the precision of the
estimate (Cumming & Finch, 2001). It is not certain whether a
given confidence interval includes the true value of the
parameter of interest unless the confidence level equals 100%.
The width of the CI depends on the confidence, or probability,
level. All else constant, higher probability levels result in
wider CIs. The width of the CI also speaks to the precision of
the results. Narrower CIs usually imply more precision and less
error associated with the results, and therefore more
confidence in the results.
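
The "one interval from a large sample of CIs" framing can be illustrated with a short simulation. This is a sketch under assumed values: the population mean, standard deviation, sample size, number of replications, and seed are all hypothetical, and the intervals are ordinary central-t CIs for a mean rather than CIs for an effect size.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical population and design, chosen only for illustration.
mu, sigma, n, reps = 50.0, 10.0, 30, 10_000

samples = rng.normal(mu, sigma, size=(reps, n))
means = samples.mean(axis=1)
std_errors = samples.std(axis=1, ddof=1) / np.sqrt(n)

# Critical values of the central t distribution for 95% and 99% CIs.
t_crit_95 = stats.t.ppf(0.975, df=n - 1)
t_crit_99 = stats.t.ppf(0.995, df=n - 1)

covered = (means - t_crit_95 * std_errors <= mu) & (mu <= means + t_crit_95 * std_errors)
coverage = covered.mean()

print(f"Proportion of 95% CIs capturing mu: {coverage:.3f}")
print(f"99% CI is wider by a factor of {t_crit_99 / t_crit_95:.2f}")
```

Any single interval either captures the population mean or does not; the 95% figure describes the long-run proportion across intervals.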

CIs for Cohen's d

Constructing CIs for Cohen's d is not straightforward and
requires the use of noncentral distributions. This is because
estimating the CI involves a ratio of two parameters, the mean
(μ) and the standard deviation (σ), which changes as either μ
or σ changes, rather than a single parameter such as μ with a
fixed estimate of σ. Such a CI cannot be obtained by simply
pivoting a probability statement (Cumming & Finch, 2001).

Calculating the CI for the standardized effect size d
first involves constructing a CI for the noncentrality parameter.
Since there is no formula that can directly give the upper and
lower bounds for the CI of the noncentrality parameter, it is
necessary to use computer software that uses an iterative
algorithmic search to estimate these boundaries (Cumming &
Finch, 2001). Once the upper and lower bounds for the
noncentrality parameter are identified, the CI for d can be
calculated by dividing the upper and lower bounds by the square
root of n.
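
The iterative search described above can be sketched as follows for the single-sample case. This is a minimal illustration, not ESCI's actual algorithm: it assumes SciPy's `scipy.stats.nct` for the noncentral t distribution and `scipy.optimize.brentq` as the root finder, and the function name `ci_for_d_one_sample` and the bracketing heuristic are choices made here. The lower bound of the noncentrality parameter is the value that places the observed t at the upper 2.5% point of the noncentral t distribution, and the upper bound is the value that places it at the lower 2.5% point (for a 95% CI).

```python
import math

from scipy.optimize import brentq
from scipy.stats import nct


def ci_for_d_one_sample(d, n, conf=0.95):
    """CI for a single-sample standardized effect via noncentral t."""
    df = n - 1
    t_obs = d * math.sqrt(n)          # observed t doubles as the point estimate
    alpha = 1.0 - conf

    # Heuristic search bracket around t_obs (an assumption of this sketch).
    spread = 5.0 * math.sqrt(1.0 + t_obs ** 2 / (2.0 * df))
    low_end, high_end = t_obs - spread, t_obs + spread

    def gap(target):
        # nct.cdf(t_obs, df, delta) decreases as delta increases,
        # so each target probability is crossed exactly once.
        return lambda delta: nct.cdf(t_obs, df, delta) - target

    delta_lower = brentq(gap(1.0 - alpha / 2.0), low_end, high_end)
    delta_upper = brentq(gap(alpha / 2.0), low_end, high_end)

    # Convert the noncentrality bounds back to the d scale.
    return delta_lower / math.sqrt(n), delta_upper / math.sqrt(n)


# Hypothetical input: d = 0.5 observed in a sample of n = 30.
lower, upper = ci_for_d_one_sample(0.5, 30)
print(f"95% CI for d: [{lower:.3f}, {upper:.3f}]")
```

Passing `conf=0.99` widens the interval, and the same call with the d and n from a reported study can be compared against the interval that ESCI displays.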

ESCI for One and Two Group Designs 

ESCI (CIdelta) is an interactive program that allows the
user to calculate and display CIs for standardized effect sizes.
Figure 2 illustrates a confidence interval for Cohen's d for a
single group (i.e., single sample t-test). Hypothetical data
(n = 30) were entered into the left side of the spreadsheet and
the program calculated the mean and standard deviation as seen
near the top of the Figure. The observed sample distribution is
given at the top of the plot area and a scale of possible
Cohen's d values is presented at the bottom of the plot area.
With a "nil" null hypothesis of μ = 0 (which can be changed in
the program), the observed Cohen's d was 5.013, with t(29) =
27.46 and a p value practically zero.

Using the iterative process based on noncentral t, the CI
is generated around the observed Cohen's d. The CI bar is seen
in the plot area of Figure 2 and gives a range of 3.677 to 6.341
(with a 95% confidence level). As noted, this bar would become
wider if we were to use a 99% confidence level. This graphic
informs us not only regarding the sample estimate of d, but also
our level of precision in estimating it. The amount of useful
interpretive information here is much greater than a single p
value or even just the d effect size.



INSERT FIGURE 2 ABOUT HERE 



Figure 3 presents results for a two sample design (i.e.,
two sample t-test). Again, hypothetical data were entered in
the left side of the spreadsheet for Groups 1 and 2 (n₁ = n₂ =
30). Means, standard deviations, and CIs for the means are then
calculated. The mean difference and the CI for that difference
are also presented. In the plot area, one can visualize the
distributions, and the 95% CIs (this level can be changed) for
the two means. The scale for the mean differences is provided
and illustrates the possibility of both positive and negative
mean differences. Finally, the d scale is at the bottom of the
plot area and the CI for the observed d of -0.501 is given as
-1.013 to 0.0149. Of course, with a "nil" null of a zero mean
difference, this effect is not statistically significant at the
.05 level (the 95% CI includes zero), t(58) = -1.942, p = 0.057.
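
The reported t and p values for the two-sample example can be approximately recovered from d and the group sizes. The sketch below rests on an assumption that is not stated in the paper: that for two independent groups with a pooled standard deviation, t equals d multiplied by the square root of n1 n2 / (n1 + n2). The function name `t_from_d_two_sample` is hypothetical, and small discrepancies from the reported values are expected because d is rounded to three decimals.

```python
import math

from scipy import stats


def t_from_d_two_sample(d, n1, n2):
    """t statistic implied by d for two independent groups (pooled SD)."""
    return d * math.sqrt(n1 * n2 / (n1 + n2))


# Values reported for the two-sample example in this paper.
d, n1, n2 = -0.501, 30, 30
df = n1 + n2 - 2

t_value = t_from_d_two_sample(d, n1, n2)
p_value = 2 * stats.t.sf(abs(t_value), df)   # two-sided p from the central t

print(f"t({df}) = {t_value:.3f}, p = {p_value:.3f}")
```

A two-sided p value just above .05 goes hand in hand with a 95% CI for d whose upper bound lies just above zero.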

INSERT FIGURE 3 ABOUT HERE 



The following conclusions, based heavily on Cumming and
Finch (2001), are illustrated in the one- and two-group designs
depicted in Figures 2 and 3:

• The upper scale on the plot area shows the CI for the
population mean μ.

• The CI for the population mean is symmetric and centered on
the observed mean.

• The lower scale on the plot area shows the CI for the
population standardized effect size δ.

• The CI for the population standardized effect size δ is not
symmetric about the observed d due to the need to use
noncentral t distributions.

• The standardized effect size measures a standardized
distance from a particular chosen reference μ₀. If μ₀ is
changed, there will be no change for the CI about the
population mean but there may be a noticeable change for
the CI about δ.

• The d scale depends on the placing of μ₀, which sets the
zero for the scale. It also depends on the standard
deviation for a particular sample.

• The standard deviation for a particular sample sets the
unit on the lower d scale.

• If different independent samples (with different data) were
identified and compared to the illustrated one and two
group designs, it would be expected that the CIs would be
different and that the standard deviation for the sample
would be different, which would affect the units on the d
scale, which in turn would make the CI for δ different.
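
The point about the reference value can be made concrete with a small sketch. The scores, the two reference values, and the function names `mean_ci` and `d_against` are all hypothetical. The CI for the mean is the usual central-t interval and does not involve the reference value at all, whereas d is measured from it.

```python
import math
from statistics import mean, stdev

from scipy import stats

# Hypothetical sample, invented for illustration.
scores = [52, 48, 55, 60, 47, 53, 58, 50, 49, 56]


def mean_ci(data, conf=0.95):
    """Central-t CI for the mean; the reference value plays no role."""
    n = len(data)
    t_crit = stats.t.ppf(1 - (1 - conf) / 2, n - 1)
    half_width = t_crit * stdev(data) / math.sqrt(n)
    return mean(data) - half_width, mean(data) + half_width


def d_against(data, reference):
    """Standardized distance of the sample mean from a chosen reference."""
    return (mean(data) - reference) / stdev(data)


ci_low, ci_high = mean_ci(scores)
d_ref_50 = d_against(scores, 50)
d_ref_55 = d_against(scores, 55)

print(f"95% CI for the mean: [{ci_low:.2f}, {ci_high:.2f}]")
print(f"d against a reference of 50: {d_ref_50:+.3f}")
print(f"d against a reference of 55: {d_ref_55:+.3f}")
```

Moving the reference shifts the zero of the d scale, so the same data yield a positive d against one reference and a negative d against the other, while the interval for the mean stays put.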

Summary 

This paper provided an introduction to CIs about
standardized effect sizes. A number of basic concepts including 
the use of noncentral t distributions were reviewed. The ESCI 
software program was used to simulate and illustrate CIs about 
single and two independent group designs. CIs around effect 
sizes (d and otherwise) can be extremely useful in both (a) 
result interpretation and (b) meta-analytic thinking that places
our single point estimate in context regarding the population 
parameter. 
Figure 2. Confidence interval for d for single group.